
gh-143658: Use str.lower and replace to further improve performance of importlib.metadata.Prepared.normalized #144083

Merged
hugovk merged 1 commit into python:main from hugovk:3.15-importlib.metadata-normalized
Feb 6, 2026

hugovk (Member) commented Jan 20, 2026

Follow on from #143660 which replaced re.sub with str.translate.

@henryiii discovered that whilst that is an improvement on Python 3.10-3.11 and 3.14, the performance of str.translate is much worse on Python 3.12 and 3.13, and can be worse than the original re.sub.

Further, his fix in pypa/packaging#1064 to replace str.translate with str.lower and str.replace calls is better than both the other options across all Python versions, to varying degrees.
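For readers following along, here is a minimal sketch of the three approaches being compared. The function names and the translation table are illustrative and are not copied from the merged diff; the table only handles ASCII letters, on the assumption that package names are ASCII.

```python
import re
import string

# Hypothetical translation table: ASCII uppercase -> lowercase, "-" and "." -> "_".
_TABLE = str.maketrans(string.ascii_uppercase + "-.", string.ascii_lowercase + "__")


def normalize_re(name):
    """Original style: PEP 503 regex, then dashes to underscores."""
    return re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")


def normalize_translate(name):
    """str.translate style (ASCII-only table), then collapse repeats."""
    value = name.translate(_TABLE)
    while "__" in value:
        value = value.replace("__", "_")
    return value


def normalize_replace(name):
    """str.lower + str.replace style, then collapse repeats."""
    value = name.lower().replace("-", "_").replace(".", "_")
    while "__" in value:
        value = value.replace("__", "_")
    return value
```

All three should agree on ASCII names such as `Foo.Bar` or `foo--bar__baz`; only the cost of getting there differs.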

I benchmarked all three across all available Python Build Standalone versions from 3.10-3.15 on macOS:

[Chart: benchmark of the three implementations across Python 3.10-3.15 on macOS]

I also benchmarked on Windows and Ubuntu; it's the same pattern.
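A comparison along these lines can be reproduced with `timeit`. This is only a rough sketch, not the benchmark script used for the chart: the sample names, repeat counts, and function names are all assumptions, and absolute numbers will vary by machine and Python version.

```python
import re
import string
import timeit

# Hypothetical ASCII-only table for the str.translate variant.
_TABLE = str.maketrans(string.ascii_uppercase + "-.", string.ascii_lowercase + "__")


def normalize_re(name):
    return re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")


def normalize_translate(name):
    value = name.translate(_TABLE)
    while "__" in value:
        value = value.replace("__", "_")
    return value


def normalize_replace(name):
    value = name.lower().replace("-", "_").replace(".", "_")
    while "__" in value:
        value = value.replace("__", "_")
    return value


# Made-up sample of typical-looking distribution names.
NAMES = ["Foo.Bar", "zope.interface", "typing_extensions", "ruamel.yaml.clib"]


def bench(func, number=2_000):
    """Return the best-of-5 time in seconds for `number` passes over NAMES."""
    timer = timeit.Timer(lambda: [func(n) for n in NAMES])
    return min(timer.repeat(repeat=5, number=number))


results = {
    f.__name__: bench(f)
    for f in (normalize_re, normalize_translate, normalize_replace)
}
for label, seconds in results.items():
    print(f"{label}: {seconds:.4f}s")
```

Running the same script under several interpreters (for example 3.12 and 3.14) is what would surface the version-dependent behaviour of `str.translate` described above.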

Whilst the improvement isn't as large for CPython 3.15 as it is for libraries such as packaging that support a wide range of Pythons, importlib.metadata does have a backport, and I'll update python/importlib_metadata#529 once this is merged.

….metadata.Prepared.normalized

Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>
hugovk requested review from jaraco and warsaw as code owners January 20, 2026 20:09
hugovk added the performance and topic-importlib labels Jan 20, 2026
hugovk changed the title from "gh-143660: Use str.lower and replace to further improve performance of importlib.metadata.Prepared.normalized" to "gh-143658: Use str.lower and replace to further improve performance of importlib.metadata.Prepared.normalized" Jan 20, 2026

hugovk (Member Author) commented Jan 22, 2026

Attached code for benchmarking and making the chart, and the results data from my macOS machine:

bench.zip

And charts and data of benchmarking on macOS, Ubuntu and Windows via GitHub Actions:

[Charts: all-macOS-latest, all-ubuntu-latest, all-windows-latest]

macos-ubuntu-windows.zip

sergey-miryanov (Contributor) left a comment

LGTM

hugovk (Member Author) commented Feb 6, 2026

This packaging change has been in version 26.0 for 16 days now, with 306,207,835 downloads and no problems. Let's merge :)

hugovk merged commit 28fb13c into python:main Feb 6, 2026
57 checks passed
hugovk deleted the 3.15-importlib.metadata-normalized branch February 6, 2026 17:39

bedevere-bot commented

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 FreeBSD Refleaks 3.x (tier-3) has failed when building commit 28fb13c.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1613/builds/2796) and take a look at the build logs.
  4. Check if the failure is related to this commit (28fb13c) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1613/builds/2796

Test leaking resources:

  • test_events: memory blocks
  • test_events: references

Summary of the results of the build (if available):

==

Traceback logs:
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/test/support/__init__.py", line 947, in gc_collect
    gc.collect()
    ~~~~~~~~~~^^
ResourceWarning: unclosed <socket.socket fd=9, family=2, type=1, proto=6, laddr=('127.0.0.1', 20202), raddr=('127.0.0.1', 20203)>
Task was destroyed but it is pending!
task: <Task pending name='Task-2918' coro=<BaseSelectorEventLoop._accept_connection2() done, defined at /buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/asyncio/selector_events.py:217> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Warning -- Unraisable exception
Exception ignored while calling deallocator <function _SelectorTransport.__del__ at 0x8470965d0>:
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/asyncio/selector_events.py", line 873, in __del__
    _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ResourceWarning: unclosed transport <_SelectorSocketTransport closing fd=9>
k


Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/test/support/__init__.py", line 947, in gc_collect
    gc.collect()
    ~~~~~~~~~~^^
ResourceWarning: unclosed <socket.socket fd=10, family=2, type=1, proto=6, laddr=('127.0.0.1', 14643), raddr=('127.0.0.1', 14644)>
Task was destroyed but it is pending!
task: <Task pending name='Task-98' coro=<BaseSelectorEventLoop._accept_connection2() done, defined at /buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/asyncio/selector_events.py:217> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Warning -- Unraisable exception
Exception ignored while calling deallocator <function _SelectorTransport.__del__ at 0x8428925d0>:
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-freebsd.refleak/build/Lib/asyncio/selector_events.py", line 873, in __del__
    _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ResourceWarning: unclosed transport <_SelectorSocketTransport closing fd=10>
k

thunder-coding pushed a commit to thunder-coding/cpython that referenced this pull request Feb 15, 2026
…formance of `importlib.metadata.Prepared.normalized` (python#144083)

Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>
"""
PEP 503 normalization plus dashes as underscores.
"""
# Emulates ``re.sub(r"[-_.]+", "-", name).lower()`` from PEP 503

This change removes this important note that links the PEP 503 prescription from the actual implementation.

# Emulates ``re.sub(r"[-_.]+", "-", name).lower()`` from PEP 503
# About 3x faster, safe since packages only support alphanumeric characters
value = name.translate(_normalize_table)
# Much faster than re.sub, and even faster than str.translate

This comment is now out of place. It's commenting on historical content.
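One way to keep the link to PEP 503 checkable, in addition to documenting it in a comment, is a small test that compares the fast path against the reference expression. The sketch below is an illustration only: `normalize_fast` is a hypothetical stand-in for the method body, and the alphabet is deliberately limited to a few ASCII characters.

```python
import itertools
import re


def reference(name):
    # PEP 503 expression, plus the dash-to-underscore step named in the docstring.
    return re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")


def normalize_fast(name):
    # Hypothetical stand-in for the str.lower + str.replace implementation.
    value = name.lower().replace("-", "_").replace(".", "_")
    while "__" in value:
        value = value.replace("__", "_")
    return value


def find_mismatches(alphabet="aB1-_.", max_len=4):
    """Return every string over `alphabet` up to `max_len` where the two differ."""
    bad = []
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            name = "".join(chars)
            if normalize_fast(name) != reference(name):
                bad.append(name)
    return bad


print(find_mismatches())
```

An empty list means the two implementations agree on every generated name, including runs of mixed separators such as `-_.`.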

jaraco (Member) commented Mar 20, 2026

These changes also need to be ported to importlib_metadata to avoid being overwritten when syncing that repo.

hugovk (Member Author) commented Mar 20, 2026

This change needs to be ported to importlib_metadata to avoid getting overridden when syncing changes from that repo.

Thanks for the review, please see PR python/importlib_metadata#529, and let's fix the comments over there.
